AITopics | Đông Hà

Đông Hà

OverThink: Slowdown Attacks on Reasoning LLMs

Kumar, Abhinav, Roh, Jaechul, Naseh, Ali, Karpinska, Marzena, Iyyer, Mohit, Houmansadr, Amir, Bagdasarian, Eugene

arXiv.org Artificial Intelligence, Feb-5-2025

We increase overhead for applications that rely on reasoning LLMs: we force models to spend an amplified number of reasoning tokens, i.e., "overthink", to respond to the user query while still providing contextually correct answers. The adversary performs an OVERTHINK attack by injecting decoy reasoning problems into the public content that the reasoning LLM uses at inference time (e.g., for RAG applications). Due to the nature of our decoy problems (e.g., a Markov Decision Process), the modified texts do not violate safety guardrails. We evaluated our attack across closed-weights (OpenAI o1, o1-mini, o3-mini) and open-weights (DeepSeek R1) reasoning models on the FreshQA and SQuAD datasets. Our results show up to 18x slowdown on the FreshQA dataset and 46x slowdown on the SQuAD dataset. The attack also shows high transferability across models. To protect applications, we discuss and implement defenses leveraging LLM-based and system design approaches. Finally, we discuss the societal, financial, and energy impacts of the OVERTHINK attack, which could amplify the costs for third-party applications operating reasoning models.

large language model, machine learning, natural language

2502.02542

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
Asia > Vietnam > Quảng Trị Province > Đông Hà (0.04)

Genre: Research Report > New Finding (0.86)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.91)

Prototype-Based Interpretability for Legal Citation Prediction

Luo, Chu Fei, Bhambhoria, Rohan, Dahan, Samuel, Zhu, Xiaodan

arXiv.org Artificial Intelligence, May-25-2023

Deep learning has made significant progress in the past decade, and demonstrates potential to solve problems with extensive social impact. In high-stakes decision making areas such as law, experts often require interpretability for automatic systems to be utilized in practical settings. In this work, we attempt to address these requirements applied to the important problem of legal citation prediction (LCP). We design the task with parallels to the thought-process of lawyers, i.e., with reference to both precedents and legislative provisions. After initial experimental results, we refine the target citation predictions with the feedback of legal experts. Additionally, we introduce a prototype architecture to add interpretability, achieving strong performance while adhering to decision parameters used by lawyers. Our study builds on and leverages the state-of-the-art language processing models for law, while addressing vital considerations for high-stakes tasks with practical societal impact.

artificial intelligence, machine learning, natural language

2305.1649

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California (0.04)
North America > Canada (0.04)

Genre: Research Report (0.82)

Industry: Law > Statutes (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Improved Sensor-Based Animal Behavior Classification Performance through Conditional Generative Adversarial Network

Zhao, Zhuqing, Ha, Dong, Damle, Abhishek, Dos, Barbara Roqueto, White, Robin, Ha, Sook

arXiv.org Artificial Intelligence, Sep-6-2022

Many activity classification methods segment data into fixed-size windows for feature extraction and classification. However, animal behaviors vary in duration and do not match a predetermined window size. Dense labeling and dense prediction methods address this limitation by predicting a label for every point. Thus, by tracing the starting and ending points, we can determine the time location and duration of every activity that occurs. Still, dense predictions can be noisy, with misalignment problems. We modified the U-Net and Conditional Generative Adversarial Network (cGAN) with customized loss functions as a training strategy to reduce fragmentation and other misalignments. In a cGAN, the discriminator and generator are trained against each other in an adversarial competition. The generator produces dense predictions. The discriminator works as a high-level consistency check, in our case pushing the generator to predict activities with reasonable durations. The model trained with cGAN shows better or comparable performance on the cow, pig, and UCI HAPT datasets. The cGAN-trained modified U-Net improved from 92.17% to 94.66% on the UCI HAPT dataset and from 90.85% to 93.18% on the pig data compared to previous dense prediction work.
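As an illustration of how per-point labels yield the time location and duration of each activity, here is a minimal sketch. It is not the authors' code: the function name `labels_to_segments`, the tuple layout, and the example labels are all hypothetical, and it assumes one label per sample at a uniform sampling rate.

```python
from itertools import groupby
from typing import Hashable, List, Sequence, Tuple

# (label, start_index, end_index_inclusive, duration_in_samples)
Segment = Tuple[Hashable, int, int, int]


def labels_to_segments(labels: Sequence[Hashable]) -> List[Segment]:
    """Collapse a dense per-sample label sequence into contiguous segments.

    Hypothetical helper: each run of identical consecutive labels becomes
    one segment, so its start, end, and duration can be read off directly.
    """
    segments: List[Segment] = []
    start = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)  # number of samples in this run
        segments.append((label, start, start + length - 1, length))
        start += length
    return segments


# Made-up dense prediction for six samples.
dense = ["rest", "rest", "walk", "walk", "walk", "rest"]
for label, first, last, duration in labels_to_segments(dense):
    print(f"{label}: samples {first}-{last} ({duration} samples)")
```

In a sketch like this, a noisy dense prediction shows up as many very short segments, which is the kind of fragmentation the abstract says the cGAN training strategy aims to reduce.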

artificial intelligence, deep learning, machine learning

2209.03758

Country:

North America > United States > Virginia (0.04)
Asia > Vietnam > Quảng Trị Province > Đông Hà (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Sponge Examples: Energy-Latency Attacks on Neural Networks

Shumailov, Ilia, Zhao, Yiren, Bates, Daniel, Papernot, Nicolas, Mullins, Robert, Anderson, Ross

arXiv.org Machine Learning, Jun-5-2020

The high energy costs of neural network training and inference led to the use of acceleration hardware such as GPUs and TPUs. While this enabled us to train large-scale neural networks in datacenters and deploy them on edge devices, the focus so far is on average-case performance. In this work, we introduce a novel threat vector against neural networks whose energy consumption or decision latency are critical. We show how adversaries can exploit carefully crafted sponge examples, which are inputs designed to maximise energy consumption and latency. We mount two variants of this attack on established vision and language models, increasing energy consumption by a factor of 10 to 200. Our attacks can also be used to delay decisions where a network has critical real-time performance, such as in perception for autonomous vehicles. We demonstrate the portability of our malicious inputs across CPUs and a variety of hardware accelerator chips, including GPUs and an ASIC simulator. We conclude by proposing a defense strategy which mitigates our attack by shifting the analysis of energy consumption in hardware from an average-case to a worst-case perspective.

energy consumption, neural network, sponge example

2006.03463

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > Santa Clara County > Mountain View (0.04)
Asia > Vietnam > Quảng Trị Province > Đông Hà (0.04)

Genre: Research Report (1.00)

Industry:

Energy (1.00)
Information Technology > Security & Privacy (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
